Dataset statistics
| Number of variables | 15 |
|---|---|
| Number of observations | 838 |
| Missing cells | 0 |
| Missing cells (%) | 0.0% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 98.3 KiB |
| Average record size in memory | 120.2 B |
Variable types
| Numeric | 7 |
|---|---|
| Categorical | 8 |
Alerts
| Alert | Type |
|---|---|
| Name has a high cardinality: 838 distinct values | High cardinality |
| Age is highly overall correlated with age_labels | High correlation |
| SibSp is highly overall correlated with FamilySize and 1 other fields | High correlation |
| Parch is highly overall correlated with FamilySize and 1 other fields | High correlation |
| Fare is highly overall correlated with FamilySize | High correlation |
| FamilySize is highly overall correlated with SibSp and 3 other fields | High correlation |
| age_labels is highly overall correlated with Age | High correlation |
| Survived is highly overall correlated with Sex and 1 other fields | High correlation |
| Sex is highly overall correlated with Survived and 1 other fields | High correlation |
| Embarked is highly overall correlated with EmbarkedIndex | High correlation |
| SexIndex is highly overall correlated with Survived and 1 other fields | High correlation |
| EmbarkedIndex is highly overall correlated with Embarked | High correlation |
| IsAlone is highly overall correlated with SibSp and 2 other fields | High correlation |
| PassengerId is uniformly distributed | Uniform |
| Name is uniformly distributed | Uniform |
| PassengerId has unique values | Unique |
| Name has unique values | Unique |
| SibSp has 568 (67.8%) zeros | Zeros |
| Parch has 638 (76.1%) zeros | Zeros |
| Fare has 13 (1.6%) zeros | Zeros |
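The zero percentages quoted above are simply the zero count divided by the 838 observations. A minimal sketch of that arithmetic, using only numbers from this report; the helper name `zero_share` is hypothetical and is not part of pandas-profiling:

```python
# Zero counts quoted in the report for each flagged variable.
N_OBSERVATIONS = 838
ZERO_COUNTS = {"SibSp": 568, "Parch": 638, "Fare": 13}


def zero_share(zeros: int, total: int) -> float:
    """Return the percentage of rows equal to zero, rounded to one decimal."""
    return round(100 * zeros / total, 1)


shares = {name: zero_share(count, N_OBSERVATIONS) for name, count in ZERO_COUNTS.items()}
for name, pct in shares.items():
    print(f"{name}: {pct}% zeros")
```

The same division reproduces any frequency column in the report from its count column.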
Reproduction
| Analysis started | 2023-07-27 09:17:24.666427 |
|---|---|
| Analysis finished | 2023-07-27 09:17:41.268125 |
| Duration | 16.6 seconds |
| Software version | pandas-profiling v3.6.6 |
| Download configuration | config.json |
PassengerId
Real number (ℝ)
UNIFORM  UNIQUE 
| Distinct | 838 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 447.13484 |
| Minimum | 1 |
| Maximum | 891 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 44.85 |
| Q1 | 225.25 |
| median | 448.5 |
| Q3 | 669.75 |
| 95-th percentile | 846.15 |
| Maximum | 891 |
| Range | 890 |
| Interquartile range (IQR) | 444.5 |
Descriptive statistics
| Standard deviation | 258.28327 |
|---|---|
| Coefficient of variation (CV) | 0.57764066 |
| Kurtosis | -1.1985332 |
| Mean | 447.13484 |
| Median Absolute Deviation (MAD) | 222.5 |
| Skewness | -0.012789643 |
| Sum | 374699 |
| Variance | 66710.246 |
| Monotonicity | Strictly increasing |
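Several of the figures above are related by simple identities. The sketch below recomputes them from the other reported PassengerId values as a consistency check; it assumes the usual definitions (CV as standard deviation over mean, IQR as Q3 minus Q1) and uses tolerances because the report rounds its output:

```python
import math

# Values copied from the PassengerId tables in this report.
mean = 447.13484
std = 258.28327
variance = 66710.246
total = 374699
n = 838
q1, q3 = 225.25, 669.75
minimum, maximum = 1, 891

cv = std / mean                  # coefficient of variation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(f"CV    = {cv:.6f}")
print(f"IQR   = {iqr}")
print(f"Range = {value_range}")
print(f"Mean from sum/n        = {total / n:.5f}")
print(f"Variance from std ** 2 = {std ** 2:.3f}")
```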
| Value | Count | Frequency (%) |
| 1 | 1 | 0.1% |
| 560 | 1 | 0.1% |
| 589 | 1 | 0.1% |
| 590 | 1 | 0.1% |
| 591 | 1 | 0.1% |
| 592 | 1 | 0.1% |
| 593 | 1 | 0.1% |
| 594 | 1 | 0.1% |
| 595 | 1 | 0.1% |
| 596 | 1 | 0.1% |
| Other values (828) | 828 |
| Value | Count | Frequency (%) |
| 1 | 1 | |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 5 | 1 | |
| 6 | 1 | |
| 7 | 1 | |
| 8 | 1 | |
| 9 | 1 | |
| 10 | 1 |
| Value | Count | Frequency (%) |
| 891 | 1 | |
| 890 | 1 | |
| 888 | 1 | |
| 887 | 1 | |
| 886 | 1 | |
| 885 | 1 | |
| 884 | 1 | |
| 883 | 1 | |
| 882 | 1 | |
| 881 | 1 |
Survived
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 838 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 1 |
| 3rd row | 1 |
| 4th row | 1 |
| 5th row | 0 |
Common Values
| Value | Count | Frequency (%) |
| 0 | 534 | 63.7% |
| 1 | 304 | 36.3% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0 | 534 | |
| 1 | 304 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 534 | |
| 1 | 304 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 838 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 534 | |
| 1 | 304 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 838 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 534 | |
| 1 | 304 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 838 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 534 | |
| 1 | 304 |
Pclass
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 838 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 3 |
|---|---|
| 2nd row | 1 |
| 3rd row | 3 |
| 4th row | 1 |
| 5th row | 3 |
Common Values
| Value | Count | Frequency (%) |
| 3 | 460 | 54.9% |
| 1 | 207 | 24.7% |
| 2 | 171 | 20.4% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 3 | 460 | |
| 1 | 207 | |
| 2 | 171 | 20.4% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 460 | |
| 1 | 207 | |
| 2 | 171 | 20.4% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 838 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 3 | 460 | |
| 1 | 207 | |
| 2 | 171 | 20.4% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 838 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 3 | 460 | |
| 1 | 207 | |
| 2 | 171 | 20.4% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 838 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 3 | 460 | |
| 1 | 207 | |
| 2 | 171 | 20.4% |
Name
Categorical
HIGH CARDINALITY  UNIFORM  UNIQUE 
| Distinct | 838 |
|---|---|
| Distinct (%) | 100.0% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
| Braund, Mr. Owen Harris | 1 |
|---|---|
| de Messemaeker, Mrs. Guillaume Joseph (Emma) | 1 |
| Gilinski, Mr. Eliezer | 1 |
| Murdlin, Mr. Joseph | 1 |
| Rintamaki, Mr. Matti | 1 |
| Other values (833) |
Length
| Max length | 82 |
|---|---|
| Median length | 51 |
| Mean length | 26.386635 |
| Min length | 12 |
Characters and Unicode
| Total characters | 22112 |
|---|---|
| Distinct characters | 59 |
| Distinct categories | 7 |
| Distinct scripts | 2 |
| Distinct blocks | 1 |
Unique
| Unique | 838 |
|---|---|
| Unique (%) | 100.0% |
Sample
| 1st row | Braund, Mr. Owen Harris |
|---|---|
| 2nd row | Cumings, Mrs. John Bradley (Florence Briggs Thayer) |
| 3rd row | Heikkinen, Miss. Laina |
| 4th row | Futrelle, Mrs. Jacques Heath (Lily May Peel) |
| 5th row | Allen, Mr. William Henry |
Common Values
| Value | Count | Frequency (%) |
| Braund, Mr. Owen Harris | 1 | 0.1% |
| de Messemaeker, Mrs. Guillaume Joseph (Emma) | 1 | 0.1% |
| Gilinski, Mr. Eliezer | 1 | 0.1% |
| Murdlin, Mr. Joseph | 1 | 0.1% |
| Rintamaki, Mr. Matti | 1 | 0.1% |
| Stephenson, Mrs. Walter Bertram (Martha Eustis) | 1 | 0.1% |
| Elsbury, Mr. William James | 1 | 0.1% |
| Bourke, Miss. Mary | 1 | 0.1% |
| Chapman, Mr. John Henry | 1 | 0.1% |
| Van Impe, Mr. Jean Baptiste | 1 | 0.1% |
| Other values (828) | 828 |
Length
| Value | Count | Frequency (%) |
| mr | 501 | 15.0% |
| miss | 156 | 4.7% |
| mrs | 121 | 3.6% |
| william | 59 | 1.8% |
| john | 40 | 1.2% |
| master | 36 | 1.1% |
| henry | 32 | 1.0% |
| james | 23 | 0.7% |
| charles | 22 | 0.7% |
| thomas | 21 | 0.6% |
| Other values (1425) | 2340 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 2515 | 11.4% |
| r | 1827 | 8.3% |
| e | 1560 | 7.1% |
| a | 1538 | 7.0% |
| i | 1209 | 5.5% |
| n | 1200 | 5.4% |
| s | 1192 | 5.4% |
| M | 1044 | 4.7% |
| l | 980 | 4.4% |
| o | 932 | 4.2% |
| Other values (49) | 8115 |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 14285 | |
| Uppercase Letter | 3369 | 15.2% |
| Space Separator | 2515 | 11.4% |
| Other Punctuation | 1684 | 7.6% |
| Close Punctuation | 123 | 0.6% |
| Open Punctuation | 123 | 0.6% |
| Dash Punctuation | 13 | 0.1% |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| r | 1827 | |
| e | 1560 | |
| a | 1538 | |
| i | 1209 | |
| n | 1200 | |
| s | 1192 | |
| l | 980 | 6.9% |
| o | 932 | 6.5% |
| t | 616 | 4.3% |
| h | 476 | 3.3% |
| Other values (16) | 2755 |
Uppercase Letter
| Value | Count | Frequency (%) |
| M | 1044 | |
| A | 232 | 6.9% |
| J | 206 | 6.1% |
| H | 177 | 5.3% |
| S | 174 | 5.2% |
| C | 160 | 4.7% |
| E | 155 | 4.6% |
| W | 133 | 3.9% |
| B | 132 | 3.9% |
| L | 119 | 3.5% |
| Other values (15) | 837 |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 839 | |
| , | 838 | |
| ' | 6 | 0.4% |
| / | 1 | 0.1% |
Space Separator
| Value | Count | Frequency (%) |
| (space) | 2515 |
Close Punctuation
| Value | Count | Frequency (%) |
| ) | 123 |
Open Punctuation
| Value | Count | Frequency (%) |
| ( | 123 |
Dash Punctuation
| Value | Count | Frequency (%) |
| - | 13 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 17654 | |
| Common | 4458 | 20.2% |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| r | 1827 | 10.3% |
| e | 1560 | 8.8% |
| a | 1538 | 8.7% |
| i | 1209 | 6.8% |
| n | 1200 | 6.8% |
| s | 1192 | 6.8% |
| M | 1044 | 5.9% |
| l | 980 | 5.6% |
| o | 932 | 5.3% |
| t | 616 | 3.5% |
| Other values (41) | 5556 |
Common
| Value | Count | Frequency (%) |
| (space) | 2515 | 56.4% |
| . | 839 | 18.8% |
| , | 838 | 18.8% |
| ) | 123 | 2.8% |
| ( | 123 | 2.8% |
| - | 13 | 0.3% |
| ' | 6 | 0.1% |
| / | 1 | < 0.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 22112 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| (space) | 2515 | 11.4% |
| r | 1827 | 8.3% |
| e | 1560 | 7.1% |
| a | 1538 | 7.0% |
| i | 1209 | 5.5% |
| n | 1200 | 5.4% |
| s | 1192 | 5.4% |
| M | 1044 | 4.7% |
| l | 980 | 4.4% |
| o | 932 | 4.2% |
| Other values (49) | 8115 |
Sex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 6 |
|---|---|
| Median length | 4 |
| Mean length | 4.673031 |
| Min length | 4 |
Characters and Unicode
| Total characters | 3916 |
|---|---|
| Distinct characters | 5 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
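The length and character totals for Sex follow directly from the value counts this report gives for the variable (556 `male`, 282 `female`). A small sketch of that calculation, using only those two counts:

```python
# Value counts for Sex, taken from this report.
counts = {"male": 556, "female": 282}

# Total characters: each value's length times the number of rows holding it.
total_chars = sum(len(value) * n for value, n in counts.items())
n_rows = sum(counts.values())
mean_length = total_chars / n_rows

print(f"Total characters: {total_chars}")
print(f"Mean length: {mean_length:.6f}")
```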
Sample
| 1st row | male |
|---|---|
| 2nd row | female |
| 3rd row | female |
| 4th row | female |
| 5th row | male |
Common Values
| Value | Count | Frequency (%) |
| male | 556 | 66.3% |
| female | 282 | 33.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| male | 556 | |
| female | 282 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 1120 | |
| m | 838 | |
| a | 838 | |
| l | 838 | |
| f | 282 | 7.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Lowercase Letter | 3916 |
Most frequent character per category
Lowercase Letter
| Value | Count | Frequency (%) |
| e | 1120 | |
| m | 838 | |
| a | 838 | |
| l | 838 | |
| f | 282 | 7.2% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 3916 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| e | 1120 | |
| m | 838 | |
| a | 838 | |
| l | 838 | |
| f | 282 | 7.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 3916 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| e | 1120 | |
| m | 838 | |
| a | 838 | |
| l | 838 | |
| f | 282 | 7.2% |
Age
Real number (ℝ)
| Distinct | 87 |
|---|---|
| Distinct (%) | 10.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 29.950274 |
| Minimum | 0.42 |
| Maximum | 80 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0.42 |
|---|---|
| 5-th percentile | 5.85 |
| Q1 | 22 |
| median | 29.699118 |
| Q3 | 35 |
| 95-th percentile | 54.15 |
| Maximum | 80 |
| Range | 79.58 |
| Interquartile range (IQR) | 13 |
Descriptive statistics
| Standard deviation | 13.0862 |
|---|---|
| Coefficient of variation (CV) | 0.43693088 |
| Kurtosis | 0.95579935 |
| Mean | 29.950274 |
| Median Absolute Deviation (MAD) | 6.3008824 |
| Skewness | 0.443171 |
| Sum | 25098.33 |
| Variance | 171.24862 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 29.69911765 | 159 | 19.0% |
| 24 | 27 | 3.2% |
| 22 | 26 | 3.1% |
| 30 | 25 | 3.0% |
| 28 | 25 | 3.0% |
| 18 | 24 | 2.9% |
| 19 | 24 | 2.9% |
| 25 | 23 | 2.7% |
| 21 | 21 | 2.5% |
| 36 | 20 | 2.4% |
| Other values (77) | 464 |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.1% |
| 0.67 | 1 | 0.1% |
| 0.75 | 2 | 0.2% |
| 0.83 | 2 | 0.2% |
| 0.92 | 1 | 0.1% |
| 1 | 6 | |
| 2 | 10 | |
| 3 | 5 | |
| 4 | 10 | |
| 5 | 4 | 0.5% |
| Value | Count | Frequency (%) |
| 80 | 1 | 0.1% |
| 74 | 1 | 0.1% |
| 71 | 2 | |
| 70.5 | 1 | 0.1% |
| 70 | 2 | |
| 66 | 1 | 0.1% |
| 65 | 3 | |
| 64 | 2 | |
| 63 | 2 | |
| 62 | 4 |
SibSp
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.52863962 |
| Minimum | 0 |
| Maximum | 8 |
| Zeros | 568 |
| Zeros (%) | 67.8% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 1 |
| 95-th percentile | 3 |
| Maximum | 8 |
| Range | 8 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.09676 |
|---|---|
| Coefficient of variation (CV) | 2.0746837 |
| Kurtosis | 17.012651 |
| Mean | 0.52863962 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 3.5952151 |
| Sum | 443 |
| Variance | 1.2028825 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 568 | 67.8% |
| 1 | 200 | 23.9% |
| 2 | 25 | 3.0% |
| 4 | 18 | 2.1% |
| 3 | 16 | 1.9% |
| 8 | 6 | 0.7% |
| 5 | 5 | 0.6% |
| Value | Count | Frequency (%) |
| 0 | 568 | |
| 1 | 200 | 23.9% |
| 2 | 25 | 3.0% |
| 3 | 16 | 1.9% |
| 4 | 18 | 2.1% |
| 5 | 5 | 0.6% |
| 8 | 6 | 0.7% |
| Value | Count | Frequency (%) |
| 8 | 6 | 0.7% |
| 5 | 5 | 0.6% |
| 4 | 18 | 2.1% |
| 3 | 16 | 1.9% |
| 2 | 25 | 3.0% |
| 1 | 200 | 23.9% |
| 0 | 568 |
Parch
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 0.37947494 |
| Minimum | 0 |
| Maximum | 6 |
| Zeros | 638 |
| Zeros (%) | 76.1% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 0 |
| Q1 | 0 |
| median | 0 |
| Q3 | 0 |
| 95-th percentile | 2 |
| Maximum | 6 |
| Range | 6 |
| Interquartile range (IQR) | 0 |
Descriptive statistics
| Standard deviation | 0.80865019 |
|---|---|
| Coefficient of variation (CV) | 2.1309713 |
| Kurtosis | 10.271277 |
| Mean | 0.37947494 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.8199432 |
| Sum | 318 |
| Variance | 0.65391514 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 638 | 76.1% |
| 1 | 114 | 13.6% |
| 2 | 71 | 8.5% |
| 5 | 5 | 0.6% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.5% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 0 | 638 | |
| 1 | 114 | 13.6% |
| 2 | 71 | 8.5% |
| 3 | 5 | 0.6% |
| 4 | 4 | 0.5% |
| 5 | 5 | 0.6% |
| 6 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 6 | 1 | 0.1% |
| 5 | 5 | 0.6% |
| 4 | 4 | 0.5% |
| 3 | 5 | 0.6% |
| 2 | 71 | 8.5% |
| 1 | 114 | 13.6% |
| 0 | 638 |
Fare
Real number (ℝ)
HIGH CORRELATION  ZEROS 
| Distinct | 244 |
|---|---|
| Distinct (%) | 29.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 32.736773 |
| Minimum | 0 |
| Maximum | 512.3292 |
| Zeros | 13 |
| Zeros (%) | 1.6% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 7.225 |
| Q1 | 7.925 |
| median | 14.4542 |
| Q3 | 31.275 |
| 95-th percentile | 113.275 |
| Maximum | 512.3292 |
| Range | 512.3292 |
| Interquartile range (IQR) | 23.35 |
Descriptive statistics
| Standard deviation | 50.354437 |
|---|---|
| Coefficient of variation (CV) | 1.5381613 |
| Kurtosis | 32.994321 |
| Mean | 32.736773 |
| Median Absolute Deviation (MAD) | 6.9459 |
| Skewness | 4.7494031 |
| Sum | 27433.416 |
| Variance | 2535.5693 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 8.05 | 41 | 4.9% |
| 13 | 40 | 4.8% |
| 7.8958 | 37 | 4.4% |
| 7.75 | 29 | 3.5% |
| 26 | 27 | 3.2% |
| 10.5 | 23 | 2.7% |
| 7.925 | 18 | 2.1% |
| 7.775 | 16 | 1.9% |
| 7.2292 | 15 | 1.8% |
| 8.6625 | 13 | 1.6% |
| Other values (234) | 579 |
| Value | Count | Frequency (%) |
| 0 | 13 | |
| 4.0125 | 1 | 0.1% |
| 5 | 1 | 0.1% |
| 6.2375 | 1 | 0.1% |
| 6.4375 | 1 | 0.1% |
| 6.45 | 1 | 0.1% |
| 6.4958 | 2 | 0.2% |
| 6.75 | 1 | 0.1% |
| 6.8583 | 1 | 0.1% |
| 6.95 | 1 | 0.1% |
| Value | Count | Frequency (%) |
| 512.3292 | 3 | |
| 263 | 4 | |
| 262.375 | 1 | 0.1% |
| 247.5208 | 2 | |
| 227.525 | 4 | |
| 221.7792 | 1 | 0.1% |
| 211.5 | 1 | 0.1% |
| 211.3375 | 3 | |
| 164.8667 | 2 | |
| 153.4625 | 3 |
Embarked
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 838 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | S |
|---|---|
| 2nd row | C |
| 3rd row | S |
| 4th row | S |
| 5th row | S |
Common Values
| Value | Count | Frequency (%) |
| S | 616 | 73.5% |
| C | 159 | 19.0% |
| Q | 63 | 7.5% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| s | 616 | |
| c | 159 | 19.0% |
| q | 63 | 7.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 616 | |
| C | 159 | 19.0% |
| Q | 63 | 7.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Uppercase Letter | 838 |
Most frequent character per category
Uppercase Letter
| Value | Count | Frequency (%) |
| S | 616 | |
| C | 159 | 19.0% |
| Q | 63 | 7.5% |
Most occurring scripts
| Value | Count | Frequency (%) |
| Latin | 838 |
Most frequent character per script
Latin
| Value | Count | Frequency (%) |
| S | 616 | |
| C | 159 | 19.0% |
| Q | 63 | 7.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 838 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| S | 616 | |
| C | 159 | 19.0% |
| Q | 63 | 7.5% |
SexIndex
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2514 |
|---|---|
| Distinct characters | 3 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 1.0 |
| 4th row | 1.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 556 | 66.3% |
| 1.0 | 282 | 33.7% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0.0 | 556 | |
| 1.0 | 282 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1394 | |
| . | 838 | |
| 1 | 282 | 11.2% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1676 | |
| Other Punctuation | 838 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1394 | |
| 1 | 282 | 16.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 838 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2514 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1394 | |
| . | 838 | |
| 1 | 282 | 11.2% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2514 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1394 | |
| . | 838 | |
| 1 | 282 | 11.2% |
EmbarkedIndex
Categorical
| Distinct | 3 |
|---|---|
| Distinct (%) | 0.4% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 3 |
|---|---|
| Median length | 3 |
| Mean length | 3 |
| Min length | 3 |
Characters and Unicode
| Total characters | 2514 |
|---|---|
| Distinct characters | 4 |
| Distinct categories | 2 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0.0 |
|---|---|
| 2nd row | 1.0 |
| 3rd row | 0.0 |
| 4th row | 0.0 |
| 5th row | 0.0 |
Common Values
| Value | Count | Frequency (%) |
| 0.0 | 616 | 73.5% |
| 1.0 | 159 | 19.0% |
| 2.0 | 63 | 7.5% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 0.0 | 616 | |
| 1.0 | 159 | 19.0% |
| 2.0 | 63 | 7.5% |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 1454 | |
| . | 838 | |
| 1 | 159 | 6.3% |
| 2 | 63 | 2.5% |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 1676 | |
| Other Punctuation | 838 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 0 | 1454 | |
| 1 | 159 | 9.5% |
| 2 | 63 | 3.8% |
Other Punctuation
| Value | Count | Frequency (%) |
| . | 838 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 2514 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 0 | 1454 | |
| . | 838 | |
| 1 | 159 | 6.3% |
| 2 | 63 | 2.5% |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 2514 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 0 | 1454 | |
| . | 838 | |
| 1 | 159 | 6.3% |
| 2 | 63 | 2.5% |
FamilySize
Real number (ℝ)
| Distinct | 9 |
|---|---|
| Distinct (%) | 1.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1.9081146 |
| Minimum | 1 |
| Maximum | 11 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 1 |
| median | 1 |
| Q3 | 2 |
| 95-th percentile | 6 |
| Maximum | 11 |
| Range | 10 |
| Interquartile range (IQR) | 1 |
Descriptive statistics
| Standard deviation | 1.6078975 |
|---|---|
| Coefficient of variation (CV) | 0.84266297 |
| Kurtosis | 8.800414 |
| Mean | 1.9081146 |
| Median Absolute Deviation (MAD) | 0 |
| Skewness | 2.6839454 |
| Sum | 1599 |
| Variance | 2.5853343 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 1 | 502 | 59.9% |
| 2 | 155 | 18.5% |
| 3 | 95 | 11.3% |
| 4 | 28 | 3.3% |
| 6 | 22 | 2.6% |
| 5 | 12 | 1.4% |
| 7 | 12 | 1.4% |
| 8 | 6 | 0.7% |
| 11 | 6 | 0.7% |
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 2 | 155 | 18.5% |
| 3 | 95 | 11.3% |
| 4 | 28 | 3.3% |
| 5 | 12 | 1.4% |
| 6 | 22 | 2.6% |
| 7 | 12 | 1.4% |
| 8 | 6 | 0.7% |
| 11 | 6 | 0.7% |
| Value | Count | Frequency (%) |
| 11 | 6 | 0.7% |
| 8 | 6 | 0.7% |
| 7 | 12 | 1.4% |
| 6 | 22 | 2.6% |
| 5 | 12 | 1.4% |
| 4 | 28 | 3.3% |
| 3 | 95 | 11.3% |
| 2 | 155 | 18.5% |
| 1 | 502 |
IsAlone
Categorical
| Distinct | 2 |
|---|---|
| Distinct (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.7 KiB |
Length
| Max length | 1 |
|---|---|
| Median length | 1 |
| Mean length | 1 |
| Min length | 1 |
Characters and Unicode
| Total characters | 838 |
|---|---|
| Distinct characters | 2 |
| Distinct categories | 1 |
| Distinct scripts | 1 |
| Distinct blocks | 1 |
Unique
| Unique | 0 |
|---|---|
| Unique (%) | 0.0% |
Sample
| 1st row | 0 |
|---|---|
| 2nd row | 0 |
| 3rd row | 1 |
| 4th row | 0 |
| 5th row | 1 |
Common Values
| Value | Count | Frequency (%) |
| 1 | 502 | 59.9% |
| 0 | 336 | 40.1% |
Common Values (Plot)
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 0 | 336 |
Most occurring characters
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 0 | 336 |
Most occurring categories
| Value | Count | Frequency (%) |
| Decimal Number | 838 |
Most frequent character per category
Decimal Number
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 0 | 336 |
Most occurring scripts
| Value | Count | Frequency (%) |
| Common | 838 |
Most frequent character per script
Common
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 0 | 336 |
Most occurring blocks
| Value | Count | Frequency (%) |
| ASCII | 838 |
Most frequent character per block
ASCII
| Value | Count | Frequency (%) |
| 1 | 502 | |
| 0 | 336 |
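The report does not state how FamilySize and IsAlone were derived. The reported figures are consistent with FamilySize = SibSp + Parch + 1 and IsAlone = 1 exactly when FamilySize is 1: the column sums satisfy 443 + 318 + 838 = 1599, and 502 rows have both FamilySize 1 and IsAlone 1. The sketch below checks that inferred relationship against a few rows copied from the report's sample tables. The two helper functions are hypothetical, and the formula is an assumption rather than documented preprocessing:

```python
# (SibSp, Parch, FamilySize, IsAlone) copied from rows in the report's sample tables.
rows = [
    (1, 0, 2, 0),  # Braund, Mr. Owen Harris
    (0, 0, 1, 1),  # Heikkinen, Miss. Laina
    (3, 1, 5, 0),  # Palsson, Master. Gosta Leonard
    (0, 2, 3, 0),  # Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg)
    (0, 5, 6, 0),  # Rice, Mrs. William (Margaret Norton)
]


def family_size(sibsp: int, parch: int) -> int:
    """Assumed derivation: siblings/spouses + parents/children + the passenger."""
    return sibsp + parch + 1


def is_alone(size: int) -> int:
    """Assumed derivation: 1 when the passenger travels without family, else 0."""
    return int(size == 1)


matches = [
    family_size(sibsp, parch) == size and is_alone(size) == alone
    for sibsp, parch, size, alone in rows
]
print(f"Rows consistent with the assumed formulas: {sum(matches)} of {len(rows)}")

# Column sums reported for SibSp, Parch and FamilySize, plus one per observation.
print(443 + 318 + 838 == 1599)
```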
age_labels
Real number (ℝ)
| Distinct | 7 |
|---|---|
| Distinct (%) | 0.8% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 4.7875895 |
| Minimum | 1 |
| Maximum | 7 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Negative | 0 |
| Negative (%) | 0.0% |
| Memory size | 6.7 KiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 2 |
| Q1 | 4 |
| median | 5 |
| Q3 | 6 |
| 95-th percentile | 6 |
| Maximum | 7 |
| Range | 6 |
| Interquartile range (IQR) | 2 |
Descriptive statistics
| Standard deviation | 1.2794632 |
|---|---|
| Coefficient of variation (CV) | 0.26724581 |
| Kurtosis | 1.8073897 |
| Mean | 4.7875895 |
| Median Absolute Deviation (MAD) | 1 |
| Skewness | -1.268688 |
| Sum | 4012 |
| Variance | 1.6370262 |
| Monotonicity | Not monotonic |
| Value | Count | Frequency (%) |
| 5 | 385 | 45.9% |
| 6 | 199 | 23.7% |
| 4 | 126 | 15.0% |
| 3 | 41 | 4.9% |
| 1 | 38 | 4.5% |
| 7 | 26 | 3.1% |
| 2 | 23 | 2.7% |
| Value | Count | Frequency (%) |
| 1 | 38 | 4.5% |
| 2 | 23 | 2.7% |
| 3 | 41 | 4.9% |
| 4 | 126 | 15.0% |
| 5 | 385 | |
| 6 | 199 | |
| 7 | 26 | 3.1% |
| Value | Count | Frequency (%) |
| 7 | 26 | 3.1% |
| 6 | 199 | |
| 5 | 385 | |
| 4 | 126 | 15.0% |
| 3 | 41 | 4.9% |
| 2 | 23 | 2.7% |
| 1 | 38 | 4.5% |
| | PassengerId | Age | SibSp | Parch | Fare | FamilySize | age_labels | Survived | Pclass | Sex | Embarked | SexIndex | EmbarkedIndex | IsAlone |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PassengerId | 1.000 | 0.034 | -0.078 | -0.005 | -0.028 | -0.061 | 0.027 | 0.110 | 0.028 | 0.072 | 0.000 | 0.072 | 0.000 | 0.033 |
| Age | 0.034 | 1.000 | -0.159 | -0.206 | 0.116 | -0.188 | 0.945 | 0.157 | 0.253 | 0.101 | 0.144 | 0.101 | 0.144 | 0.346 |
| SibSp | -0.078 | -0.159 | 1.000 | 0.447 | 0.442 | 0.852 | -0.152 | 0.195 | 0.157 | 0.224 | 0.092 | 0.224 | 0.092 | 0.839 |
| Parch | -0.005 | -0.206 | 0.447 | 1.000 | 0.407 | 0.796 | -0.201 | 0.173 | 0.034 | 0.262 | 0.028 | 0.262 | 0.028 | 0.680 |
| Fare | -0.028 | 0.116 | 0.442 | 0.407 | 1.000 | 0.525 | 0.120 | 0.313 | 0.485 | 0.206 | 0.195 | 0.206 | 0.195 | 0.306 |
| FamilySize | -0.061 | -0.188 | 0.852 | 0.796 | 0.525 | 1.000 | -0.174 | 0.225 | 0.145 | 0.213 | 0.083 | 0.213 | 0.083 | 0.635 |
| age_labels | 0.027 | 0.945 | -0.152 | -0.201 | 0.120 | -0.174 | 1.000 | 0.132 | 0.247 | 0.101 | 0.113 | 0.101 | 0.113 | 0.349 |
| Survived | 0.110 | 0.157 | 0.195 | 0.173 | 0.313 | 0.225 | 0.132 | 1.000 | 0.356 | 0.547 | 0.161 | 0.547 | 0.161 | 0.218 |
| Pclass | 0.028 | 0.253 | 0.157 | 0.034 | 0.485 | 0.145 | 0.247 | 0.356 | 1.000 | 0.146 | 0.242 | 0.146 | 0.242 | 0.136 |
| Sex | 0.072 | 0.101 | 0.224 | 0.262 | 0.206 | 0.213 | 0.101 | 0.547 | 0.146 | 1.000 | 0.085 | 0.997 | 0.085 | 0.320 |
| Embarked | 0.000 | 0.144 | 0.092 | 0.028 | 0.195 | 0.083 | 0.113 | 0.161 | 0.242 | 0.085 | 1.000 | 0.085 | 1.000 | 0.092 |
| SexIndex | 0.072 | 0.101 | 0.224 | 0.262 | 0.206 | 0.213 | 0.101 | 0.547 | 0.146 | 0.997 | 0.085 | 1.000 | 0.085 | 0.320 |
| EmbarkedIndex | 0.000 | 0.144 | 0.092 | 0.028 | 0.195 | 0.083 | 0.113 | 0.161 | 0.242 | 0.085 | 1.000 | 0.085 | 1.000 | 0.092 |
| IsAlone | 0.033 | 0.346 | 0.839 | 0.680 | 0.306 | 0.635 | 0.349 | 0.218 | 0.136 | 0.320 | 0.092 | 0.320 | 0.092 | 1.000 |
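The high-correlation alerts can be reproduced from this matrix by listing, for each variable, the others whose absolute correlation meets a cutoff. The sketch below does this for a seven-variable subset copied from the matrix above. The cutoff of 0.5 is an assumption inferred from which pairs the report flags (Fare and FamilySize at 0.525 are flagged, Fare and Pclass at 0.485 are not), and `correlated_with` is a hypothetical helper rather than a pandas-profiling function:

```python
THRESHOLD = 0.5  # assumed cutoff, inferred from the pairs flagged in the alerts

names = ["Age", "SibSp", "Parch", "Fare", "FamilySize", "age_labels", "IsAlone"]

# Sub-matrix copied from the correlation table; rows and columns follow `names`.
matrix = [
    [1.000, -0.159, -0.206, 0.116, -0.188, 0.945, 0.346],
    [-0.159, 1.000, 0.447, 0.442, 0.852, -0.152, 0.839],
    [-0.206, 0.447, 1.000, 0.407, 0.796, -0.201, 0.680],
    [0.116, 0.442, 0.407, 1.000, 0.525, 0.120, 0.306],
    [-0.188, 0.852, 0.796, 0.525, 1.000, -0.174, 0.635],
    [0.945, -0.152, -0.201, 0.120, -0.174, 1.000, 0.349],
    [0.346, 0.839, 0.680, 0.306, 0.635, 0.349, 1.000],
]


def correlated_with(matrix, names, threshold):
    """Map each variable to the others whose |correlation| meets the threshold."""
    result = {}
    for i, row in enumerate(matrix):
        result[names[i]] = [
            names[j]
            for j, value in enumerate(row)
            if j != i and abs(value) >= threshold
        ]
    return result


flagged = correlated_with(matrix, names, THRESHOLD)
for name, others in flagged.items():
    print(f"{name}: {', '.join(others)}")
```

With this cutoff, FamilySize is paired with four other variables and IsAlone with three, which matches the counts in the alerts for this subset.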
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked | SexIndex | EmbarkedIndex | FamilySize | IsAlone | age_labels |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 3 | Braund, Mr. Owen Harris | male | 22.000000 | 1.0 | 0 | 7.2500 | S | 0.0 | 0.0 | 2.0 | 0 | 4.0 |
| 1 | 2 | 1 | 1 | Cumings, Mrs. John Bradley (Florence Briggs Thayer) | female | 38.000000 | 1.0 | 0 | 71.2833 | C | 1.0 | 1.0 | 2.0 | 0 | 6.0 |
| 2 | 3 | 1 | 3 | Heikkinen, Miss. Laina | female | 26.000000 | 0.0 | 0 | 7.9250 | S | 1.0 | 0.0 | 1.0 | 1 | 5.0 |
| 3 | 4 | 1 | 1 | Futrelle, Mrs. Jacques Heath (Lily May Peel) | female | 35.000000 | 1.0 | 0 | 53.1000 | S | 1.0 | 0.0 | 2.0 | 0 | 6.0 |
| 4 | 5 | 0 | 3 | Allen, Mr. William Henry | male | 35.000000 | 0.0 | 0 | 8.0500 | S | 0.0 | 0.0 | 1.0 | 1 | 6.0 |
| 5 | 6 | 0 | 3 | Moran, Mr. James | male | 29.699118 | 0.0 | 0 | 8.4583 | Q | 0.0 | 2.0 | 1.0 | 1 | 5.0 |
| 6 | 7 | 0 | 1 | McCarthy, Mr. Timothy J | male | 54.000000 | 0.0 | 0 | 51.8625 | S | 0.0 | 0.0 | 1.0 | 1 | 6.0 |
| 7 | 8 | 0 | 3 | Palsson, Master. Gosta Leonard | male | 2.000000 | 3.0 | 1 | 21.0750 | S | 0.0 | 0.0 | 5.0 | 0 | 1.0 |
| 8 | 9 | 1 | 3 | Johnson, Mrs. Oscar W (Elisabeth Vilhelmina Berg) | female | 27.000000 | 0.0 | 2 | 11.1333 | S | 1.0 | 0.0 | 3.0 | 0 | 5.0 |
| 9 | 10 | 1 | 2 | Nasser, Mrs. Nicholas (Adele Achem) | female | 14.000000 | 1.0 | 0 | 30.0708 | C | 1.0 | 1.0 | 2.0 | 0 | 3.0 |
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Fare | Embarked | SexIndex | EmbarkedIndex | FamilySize | IsAlone | age_labels |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 828 | 881 | 1 | 2 | Shelley, Mrs. William (Imanita Parrish Hall) | female | 25.0 | 0.0 | 1 | 26.0000 | S | 1.0 | 0.0 | 2.0 | 0 | 5.0 |
| 829 | 882 | 0 | 3 | Markun, Mr. Johann | male | 33.0 | 0.0 | 0 | 7.8958 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0 |
| 830 | 883 | 0 | 3 | Dahlberg, Miss. Gerda Ulrika | female | 22.0 | 0.0 | 0 | 10.5167 | S | 1.0 | 0.0 | 1.0 | 1 | 4.0 |
| 831 | 884 | 0 | 2 | Banfield, Mr. Frederick James | male | 28.0 | 0.0 | 0 | 10.5000 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0 |
| 832 | 885 | 0 | 3 | Sutehall, Mr. Henry Jr | male | 25.0 | 0.0 | 0 | 7.0500 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0 |
| 833 | 886 | 0 | 3 | Rice, Mrs. William (Margaret Norton) | female | 39.0 | 0.0 | 5 | 29.1250 | Q | 1.0 | 2.0 | 6.0 | 0 | 6.0 |
| 834 | 887 | 0 | 2 | Montvila, Rev. Juozas | male | 27.0 | 0.0 | 0 | 13.0000 | S | 0.0 | 0.0 | 1.0 | 1 | 5.0 |
| 835 | 888 | 1 | 1 | Graham, Miss. Margaret Edith | female | 19.0 | 0.0 | 0 | 30.0000 | S | 1.0 | 0.0 | 1.0 | 1 | 4.0 |
| 836 | 890 | 1 | 1 | Behr, Mr. Karl Howell | male | 26.0 | 0.0 | 0 | 30.0000 | C | 0.0 | 1.0 | 1.0 | 1 | 5.0 |
| 837 | 891 | 0 | 3 | Dooley, Mr. Patrick | male | 32.0 | 0.0 | 0 | 7.7500 | Q | 0.0 | 2.0 | 1.0 | 1 | 5.0 |